(54) METHODS FOR DETECTING AND ISOLATING NUCLEAR TRANSPORT PROTEINS

(57) Whether or not a test DNA encodes a peptide 
with nuclear transportability can be readily detected by 
introducing a fusion DNA formed by a DNA encoding a 
transcription factor whose nuclear transportability is 
eliminated and the test DNA into a eukaryotic host hav- 
ing in its nucleus a promoter region that is activated 
when said transcription factor binds thereto and a 
reporter gene whose expression is induced by said pro- 
moter region, and detecting the expression of said 
reporter gene. A DNA encoding a peptide with nuclear 
transportability is efficiently and systematically isolated 
by isolating a test DNA from cells expressing the 
reporter gene. 
Description

Technical Field 

[0001] The present invention relates to methods for detecting and isolating nuclear transport proteins and belongs
to the field of genetic engineering, in particular, the field of gene cloning.

Background Art 

[0002] Various transcription factors, nuclear receptors, signal transduction factors, chromatin receptors, etc. are
known as nuclear transport proteins. These proteins directly or indirectly interact with specific regions of DNA near the
end of the intracellular signal transduction cascade, regulate gene expression and DNA replication, and thereby determine
the behavior of cells. Hence, isolating the genes encoding these nuclear transport proteins and analyzing their function
is thought to be very important for clarifying various life phenomena and developing novel drugs.

[0003] However, no dedicated method for systematically cloning cDNAs encoding nuclear transport proteins has been
developed, and so far general cloning techniques have been applied. In particular, when such information is available,
a cDNA library has been screened based on some information about the protein to be cloned, such as sequences
conserved at the amino acid level (Lichtsteiner, S., Proc. Natl. Acad. Sci., 1993, 90: 9673-9677), the DNA sequence
interacting with the protein (Sanz, L., Mol. Cell. Biol., 1995, 15: 3164-3170; MATCHMAKER One-Hybrid System
(CLONTECH)), or proteins interacting with the protein to be cloned. In this case, screening has been possible only in
extremely limited ranges.

[0004] For example, the two-hybrid system (Gyuris, J., Cell, 1993, 75: 791-803; Golemis, E. A., Current Protocols
in Molecular Biology (John Wiley & Sons, Inc.), 1996, Ch. 20.0 and 20.1), which has recently been developed as a
method for isolating proteins interacting with a given protein, can indirectly screen for cDNAs encoding proteins that
interact with a bait protein known to exist in the nucleus. However, it cannot be used for directly screening cDNAs
encoding proteins with nuclear transportability. In addition, even if a protein known to exist in the nucleus is used as
a bait (Jordan, K. L., Biochemistry, 1996, 35: 12320-12328), cDNAs encoding proteins other than nuclear proteins may
be isolated because it is unknown whether the proteins interact in the cytoplasm before their transportation to the
nucleus or whether they actually interact in the nucleus. This requires a laborious procedure to confirm whether the
isolated cDNAs encode nuclear transport proteins or not. Besides, since the two-hybrid system uses the interaction
between proteins as an indication, the proteins obtained through the screening are limited to those that can interact
with the protein used as a bait.

[0005] When information about a desired protein as mentioned above is not available, the only method of screening
for a desired cDNA is to purify the protein from the nuclear fraction of cells using some function of the protein, such
as specific bioactivity, and to screen a cDNA library based on the sequence information of the protein obtained
(Ostrowski, J., J. Biol. Chem., 1994, 269: 17626-17634). However, the purification of such proteins often takes much
labor and time and, in some cases, is substantially impossible since the expression level of some nuclear transport
proteins is very low.

Disclosure of the Invention 

[0006] An objective of the present invention is to provide a method for easily and efficiently detecting and isolating 
a cDNA encoding a peptide having nuclear transportability. 

[0007] An example of nuclear transport proteins is a transcription factor. Transcription factors of eukaryotes induce
the expression of a specific gene by interacting with the promoter region of said specific gene after their
transportation to the nucleus. The nuclear transportability of a transcription factor is attributed to the nuclear
transport signal existing in the transcription factor. The present inventors studied how to attain the objective
mentioned above by focusing on two properties of transcription factors, the nuclear transportability and the ability to
activate the transcription of a specific gene. We thought that, when a fusion protein obtained by removing the region
having nuclear transportability from a transcription factor and introducing an unknown peptide in its place is
expressed, the fusion protein is transported into the nucleus and interacts with a specific promoter region to induce
the expression of the specific gene downstream if said unknown peptide in the fusion protein has nuclear
transportability. In contrast, if the unknown peptide in the fusion protein has no nuclear transportability, the fusion
protein is not transported into the nucleus and does not induce the expression of the gene downstream of the specific
promoter. Specifically, the present inventors thought that whether an unknown peptide in a fusion protein has nuclear
transportability or not could be judged by observing whether the fusion protein in which the unknown peptide is fused
to a transcription factor without nuclear transportability induces the expression of the gene downstream of the
specific promoter.
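
The reasoning above can be summarized as a small decision model. The following Python sketch is only an illustration under simplifying assumptions: the class and function names are hypothetical, nuclear localization is treated as a yes/no property, and passive diffusion through the nuclear pore is ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FusionProtein:
    """Hypothetical model of the fusion protein described in the text."""
    binds_promoter: bool           # DNA binding ability is retained
    activates_transcription: bool  # transcription activating ability is retained
    test_peptide_has_nls: bool     # property of the unknown peptide under test


def reporter_expressed(fusion: FusionProtein) -> bool:
    """Return True if the gene downstream of the specific promoter is induced.

    The transcription factor part has lost its own nuclear transportability,
    so in this model nuclear entry depends only on the test peptide.
    """
    in_nucleus = fusion.test_peptide_has_nls
    return in_nucleus and fusion.binds_promoter and fusion.activates_transcription


def judge_test_peptide(reporter_signal: bool) -> str:
    """Interpret the read-out: expression implies nuclear transportability."""
    if reporter_signal:
        return "nuclear transportability"
    return "no nuclear transportability"
```

In this model a fusion that keeps its DNA binding and activating abilities reports only on the test peptide, which is the property the method relies on.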

[0008] Based on such ideas, the present inventors prepared a fusion DNA formed by the DNA encoding a transcription
factor from which the region having the nuclear transportability had been removed and the test DNA. We then introduced
the fusion DNA into a eukaryotic host having in its nucleus a promoter region that is activated when a transcription
factor binds thereto and a reporter gene whose expression is induced by the activation of said promoter region.
Finally, we detected the expression of said reporter gene. As a result, the present inventors found that the
expression of the reporter gene is induced when the test DNA encoding a peptide with nuclear transportability is used
but not when the test DNA encoding a peptide without nuclear transportability is used.

[0009] The present inventors next prepared a library of cDNAs encoding fusion proteins formed by a transcription
factor from which the region having the nuclear transportability had been removed and another peptide. We subsequently
introduced the library into cells and screened cDNAs encoding peptides having the nuclear transportability using the
expression of the reporter as an indication. As a result, the present inventors found that many of the known cDNAs
among the cDNAs isolated from the cDNA library encode proteins that are thought to have nuclear transportability.

[0010] The present invention relates to a method for easily and efficiently detecting and isolating a DNA encoding 
a peptide having nuclear transportability using the properties of a transcription factor, and more specifically, to 

(1) a method for detecting nuclear transportability of a peptide encoded by a test DNA, the method comprising
introducing a fusion DNA formed by a DNA encoding a transcription factor without nuclear transportability and the test
DNA into a eukaryotic host having in its nucleus a promoter region that is activated when said transcription factor
binds thereto and a reporter gene connected to the downstream of said promoter region, and detecting expression
of said reporter gene;

(2) the method of (1), wherein the transcription factor without nuclear transportability is a fusion protein comprising
a nuclear export signal, a DNA binding domain, and a transcription activating domain;

(3) the method of (1), wherein the transcription factor without nuclear transportability is a fusion protein comprising
a nuclear export signal, a LexA protein, and a GAL4-transcription activating domain, and the promoter region
activated when said transcription factor binds thereto is that of a GAL1 gene whose operator sequence is replaced with
that of LexA;

(4) the method of (3), wherein the nuclear export signal is a peptide comprising the amino acid sequence set forth
in SEQ ID NO: 5;

(5) the method of any one of (1) to (4), wherein the reporter gene is LEU2 and/or a β-galactosidase gene;

(6) a method for isolating a DNA encoding a peptide with nuclear transportability, the method comprising introducing
a fusion DNA formed by a DNA encoding a transcription factor without nuclear transportability and a test DNA
into a eukaryotic host having in its nucleus a promoter region activated when said transcription factor binds thereto
and a reporter gene connected downstream of said promoter region, detecting the expression of said reporter
gene, and isolating the test DNA from the eukaryotic host in which the expression has been detected;

(7) the method of (6), wherein the transcription factor without nuclear transportability is a fusion protein comprising
a nuclear export signal, a DNA binding domain, and a transcription activating domain;

(8) the method of (6), wherein the transcription factor without nuclear transportability is a fusion protein comprising
a nuclear export signal, a LexA protein, and a GAL4-transcription activating domain, and the promoter region
activated when said transcription factor binds thereto is that of a GAL1 gene whose operator sequence is replaced with
that of LexA;

(9) the method of (8), wherein the nuclear export signal is a peptide comprising the amino acid sequence set forth
in SEQ ID NO: 5;

(10) the method of any one of (6) to (9), wherein the reporter gene is LEU2 and/or a β-galactosidase gene;

(11) a vector comprising a DNA encoding a transcription factor without nuclear transportability and an introduction
site for a test DNA adjacent thereto;

(12) the vector of (11), wherein the transcription factor without nuclear transportability is a fusion protein comprising
a nuclear export signal, a DNA binding domain, and a transcription activating domain;

(13) the vector of (11), wherein the transcription factor without nuclear transportability is a fusion protein comprising
a nuclear export signal, a LexA protein, and a GAL4-transcription activating domain;

(14) the vector of (13), wherein the nuclear export signal is the peptide comprising the amino acid sequence set
forth in SEQ ID NO: 5;

(15) a kit comprising (i) a vector comprising a DNA encoding a transcription factor without nuclear transportability
and an introduction site for a test DNA adjacent thereto, and (ii) a eukaryotic host having in its nucleus an
expression unit comprising a promoter region activated when said transcription factor binds thereto and a reporter gene
connected to the downstream of said promoter region;

(16) the kit of (15), wherein the transcription factor without nuclear transportability is a fusion protein comprising a
nuclear export signal, a DNA binding domain, and a transcription activating domain;

(17) the kit of (15), wherein the transcription factor without nuclear transportability is a fusion protein comprising
a nuclear export signal, a LexA protein, and a GAL4-transcription activating domain, and the promoter region
activated when said transcription factor binds thereto is that of a GAL1 gene whose operator sequence is replaced
with that of LexA, and the eukaryotic host is yeast;

(18) the kit of (17), wherein the nuclear export signal is a peptide comprising the amino acid sequence set forth in
SEQ ID NO: 5; and

(19) the kit of any one of (15) to (18), wherein the reporter gene is LEU2 and/or a β-galactosidase gene.

[0011] The term "transcription factor" used herein means a protein that has a DNA binding domain and a transcription
activating domain and that activates the transcription of a specific gene. It is not limited to a natural protein. The
term "peptide" used herein includes not only a protein but also a partial peptide of a protein, a synthetic peptide, etc.

[0012] The first aspect of the present invention relates to a method for detecting nuclear transportability of a peptide
encoded by a test DNA. The method comprises introducing a fusion DNA formed by a DNA encoding a transcription
factor without nuclear transportability and the test DNA into a eukaryotic host having in its nucleus a promoter region
that is activated when said transcription factor binds thereto and a reporter gene whose expression is induced by the
activation of said promoter region, and then detecting the expression of said reporter gene.

[0013] The transcription factor used in the present invention to prepare a transcription factor without nuclear
transportability is not limited as long as the transcription factor can specifically regulate the expression of a gene in a
eukaryote. Examples are GAL4 (Giniger, E., Cell, 1985, 40: 767-774), p53 (Chumakov, P. M., Genetika, 1988, 24: 602-612),
GCN4 (Hinnenbusch, A. G., Proc. Natl. Acad. Sci., 1984, 81: 6442-6446), VP16 (Triezeneberg, S. J., Genes. Dev.,
1988, 2: 718-729), RelA (Nolan, G. P., Cell, 1991, 64: 961-969), Oct-1 (Strum, R. A., Genes. Dev., 1988, 2: 1582-1599),
c-Myc (Watt, R., Nature, 1983, 303: 725-728), c-Jun (Angel, P., Cell, 1988, 55: 875-885), and MyoD (Write, W. E., Cell,
1989, 56: 607-617).

[0014] The transcription factor without nuclear transportability in the present invention is not limited as long as it has
transcription activating ability and DNA binding ability but no nuclear transportability (or extremely low nuclear
transportability). Examples are a transcription factor whose nuclear transport signal is deleted or replaced with other
amino acids, and a fusion protein comprising a DNA binding domain and a transcription activating region.

[0015] A nuclear pore is generally thought to be able to transfer low molecular weight substances (molecular
weights of 40 kDa or less) by diffusion as well as by a specific active transport system. Even if the active nuclear
transportability of a transcription factor is eliminated by deleting or replacing a nuclear transport signal, the factor
can sometimes be transferred into the nucleus by diffusion. In this case, transfer into the nucleus by diffusion can be
inhibited completely or reduced to a minimum by further adding a signal localizing the protein to a cellular site other
than the nucleus. The transcription factor without nuclear transportability of the present invention includes factors
having a cell localization signal other than a nuclear localization signal. Examples of such a signal are a nuclear export
signal (Görlich, D., Science, 1996, 271: 1513-1518), a secretion signal, a peroxisome transport signal, a rough-surfaced
endoplasmic reticulum transport signal, a mitochondrion transport signal (Nakai, K., Genomics, 1992, 14: 897-911;
Nakai, K., PSORT WWW server, http://psort.nibb.ac.jp/), etc., but are not limited thereto.
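
As a rough illustration of the size consideration above, the sketch below estimates whether a factor lacking an active nuclear transport signal might still enter the nucleus by diffusion and would therefore benefit from an added localization signal. The 40 kDa limit is taken from the text; the average residue mass of 110 Da is a common rule of thumb and not a value from this document, and all function names are hypothetical.

```python
DIFFUSION_LIMIT_KDA = 40.0   # upper size for passive diffusion, as stated in the text
AVG_RESIDUE_MASS_DA = 110.0  # assumption: rule-of-thumb average mass per amino acid


def estimated_mass_kda(n_residues: int) -> float:
    """Approximate the molecular weight of a protein from its length."""
    if n_residues < 0:
        raise ValueError("n_residues must be non-negative")
    return n_residues * AVG_RESIDUE_MASS_DA / 1000.0


def may_diffuse_into_nucleus(n_residues: int) -> bool:
    """True if the estimated mass is at or below the diffusion limit."""
    return estimated_mass_kda(n_residues) <= DIFFUSION_LIMIT_KDA


def needs_extra_localization_signal(n_residues: int, has_active_nls: bool) -> bool:
    """True when a factor without an active signal could still leak into the nucleus."""
    return (not has_active_nls) and may_diffuse_into_nucleus(n_residues)
```

A real construct would have to be checked experimentally, since shape and interactions also affect passage through the pore.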

[0016] In addition, there are transcription factors that have multiple nuclear transport signals or that have nuclear
transportability but for which a nuclear transport signal site in the molecule has not been completely identified (GAL4,
p53, etc. (Tanaka, M., Cell Science (in Japanese), 1991, 7: 265-272)). There are also transcription factors whose
DNA binding domains or transcription activating domains overlap with their nuclear transport signals, so that deleting
or replacing their nuclear transport signals is likely to eliminate their DNA binding ability or transcription activating
ability. When such a transcription factor is used, even if it is impossible to completely identify the nuclear transport
signal sequence, a transcription factor without nuclear transportability can be prepared by identifying the region that
must be removed to eliminate the nuclear transportability and deleting or replacing that region. In addition, a
transcription factor without nuclear transportability can be prepared by generating an artificial hybrid transcription
factor in which a DNA binding domain of a eukaryote- or prokaryote-derived protein that is known not to have a nuclear
transport signal and a transcription activating domain that is known not to have a nuclear transport signal are fused.
The "transcription factor without nuclear transportability" used herein includes the transcription factors thus prepared.

[0017] Examples of a transcription activating domain used herein to prepare a transcription factor without nuclear
transportability include GAL4 (Brent, R., Cell, 1985, 43: 729-736), Bicoid, c-Fos, c-Myc, v-Myc, B6, B7, B42 (Golemis,
A. E., Mol. Cell. Biol., 1992, 12: 3006-3014), GCN4 (Hope, I. A., Cell, 1986, 46: 885-894), and VP16 (CLONTECH,
Mammalian MATCHMAKER Two-Hybrid Assay Kit), but are not limited thereto. In addition, examples of a DNA binding
domain are those identified in transcription factors such as GAL4 (Giniger, E., Cell, 1985, 40: 767-774), p53
(Chumakov, P. M., Genetika, 1988, 24: 602-612), GCN4 (Hinnenbusch, A. G., Proc. Natl. Acad. Sci., 1984, 81: 6442-6446),
VP16 (Triezeneberg, S. J., Genes. Dev., 1988, 2: 718-729), RelA (Nolan, G. P., Cell, 1991, 64: 961-969), Oct-1 (Strum,
R. A., Genes. Dev., 1988, 2: 1582-1599), c-Myc (Watt, R., Nature, 1983, 303: 725-728), c-Jun (Angel, P., Cell, 1988, 55:
875-885), and MyoD (Write, W. E., Cell, 1989, 56: 607-617).

[0018] A DNA encoding a transcription factor without nuclear transportability can be prepared by a method in which
the DNA sequence encoding the nuclear transport signal of the DNA encoding a transcription factor is completely or
partially deleted, a method in which the sequence within the nuclear transport signal is replaced using site-directed
mutagenesis, a method in which a cell localization signal other than a nuclear localization signal is added, a method in
which a transcription activating domain is fused with a DNA binding domain, or a method in which these methods are
appropriately combined. General gene manipulation in these methods is described in the literature (Sambrook, J.,
Molecular Cloning: A Laboratory Manual, 1989, 2nd Ed. Cold Spring Harbor Laboratory Press, Cold Spring Harbor,
NY).

[0019] The test DNA used herein is not limited as long as it encodes a protein or its partial peptide. The test DNA
includes cDNA, genomic DNA, and synthetic DNA. A DNA encoding a transcription factor without nuclear
transportability can be fused with a test DNA by the usual method (Sambrook, J., Molecular Cloning: A Laboratory Manual,
1989, 2nd Ed. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY).

[0020] A fusion DNA formed by a DNA encoding a transcription factor without nuclear transportability and a test
DNA is usually inserted into an appropriate expression vector, which is then introduced into a eukaryotic host. The
expression vector is not limited as long as it can stably express in a eukaryotic host a protein encoded by a fusion DNA
formed by a DNA encoding a transcription factor without nuclear transportability and a test DNA. A shuttle vector stably
maintained in both a host and E. coli is preferable. For example, when baker's yeast is used as a host, a unit for
expressing the protein is introduced into an integration vector, which has no replication origin and is integrated
into the yeast chromosome, or into a plasmid vector, which has a replication origin and exists as a plasmid. (The
expression unit here comprises a promoter region functioning in baker's yeast (e.g. that of ADH1 or GAL1), the coding
region of an expressed protein, a multiple cloning site, and a terminator region (e.g. that of ADH1). For the plasmid
vector, a centromere vector (low copy number), a 2 μ vector (high copy number), etc. are commercially available.)
Specifically, for the integration vector and the centromere vector, pRS vector, which has various auxotrophy marker genes
(LEU2, HIS3, URA3, TRP1, etc.) for complementing the host auxotrophy, is commercially available from STRATAGENE
as a kit, which further comprises a mutant host strain corresponding to the respective marker genes. In addition, a
marketed vector (HybriZapII, GAL4 Two-Hybrid Phagemid vector (STRATAGENE), MATCHMAKER vector (CLONTECH),
etc.) that has various auxotrophy marker genes (LEU2, HIS3, URA3, TRP1, etc.) for complementing the host
auxotrophy and that is used in a two-hybrid system, as well as a mutant host strain corresponding to the respective
vectors, can be used for the 2 μ vector. When animal cells are used as hosts, a marketed general mammalian expression
vector, such as a vector that is integrated into the chromosome (pMAM, pMAM-neo (CLONTECH), etc.) or a vector
maintained as an episome (λDR2, pDR2 vector system (CLONTECH), etc.), can be used with appropriate host animal
cells (CHO, mouse fibroblast, HeLa, U937, BHK, etc.). In addition, a vector for transient expression using COS cells,
pMT2 (Sambrook, J., Molecular Cloning: A Laboratory Manual, 1989, 2nd Ed. Cold Spring Harbor Laboratory Press,
Cold Spring Harbor, NY), can be used. The fusion DNA mentioned above can be inserted into an expression vector by
the usual method (Sambrook, J., Molecular Cloning: A Laboratory Manual, 1989, 2nd Ed. Cold Spring Harbor
Laboratory Press, Cold Spring Harbor, NY).

[0021] In addition, the eukaryotic host in the present invention into which the fusion DNA mentioned above is
introduced is not limited as long as it can stably express the protein encoded by the fusion DNA. In particular, yeast and
animal cultured cells are preferable from the viewpoints of simplicity in handling, ease of introducing and recovering a
gene, safety, etc. The eukaryotic host used in this invention has in its nucleus a promoter region activated by binding of
a specific transcription factor thereto, and a reporter gene connected to the downstream of the promoter region.

[0022] The promoter region activated by binding of a specific transcription factor thereto is not limited as long as the
promoter region comprises a cis regulatory region, called the upstream activating sequence (UAS) or operator
sequence, for binding the transcription factor and the TATA sequence, and specifically activates transcription when
the transcription factor binds to the UAS. Examples of the cis regulatory region of baker's yeast are natural GAL1 UAS
(comprising four GAL4 binding sequences), artificial GAL1 UAS (comprising three GAL4 sequences), LexA UAS (comprising
one to eight LexA binding sequences) (Estojak, J., Mol. Cell. Biol., 1995, 15: 5820-5829), etc. Examples of the TATA
sequence are GAL1 TATA, CYC1 (cytochrome C1) TATA, LEU2 TATA, HIS3 TATA, etc. By combining the cis regulatory
regions and the TATA sequences, various promoter regions whose expression level and induction condition differ from
one another (CLONTECH, Yeast Protocols Handbook, PT3024-1: 5-8) can be constructed. Specifically, any promoter
region that has a transcription factor binding sequence in the cis regulatory region and in which the transcription factor
regulates the activity of the promoter can be used.

[0023] In addition, for baker's yeast, which has been thoroughly analyzed genetically, a reporter gene such as a
gene related to the auxotrophy of a host (LEU2, HIS3, TRP1, URA3, etc.), a gene (e.g. GAL1) related to the metabolism
of an essential nutritive source, or a gene that can complement the deficiency of another gene essential for survival
enables easy detection of gene expression in terms of the viability of the host. Moreover, generally used reporters, such
as a reporter gene whose expression can be detected by enzymatic activity, such as that for β-galactosidase,
chloramphenicol acetyltransferase, or luciferase, and green fluorescent protein (CLONTECH), whose fluorescence can be
directly detected while leaving cells alive, are also available. In addition, the generally used reporter gene or the drug
resistance gene mentioned previously can also be used in animal cells.

[0024] The promoter region and the reporter gene mentioned above can be connected by the usual method (Sam- 
brook, J., Molecular Cloning: A Laboratory Manual, 1989, 2nd Ed. Cold Spring Harbor Laboratory Press, Cold Spring 
Harbor, NY). 

[0025] A promoter region activated by binding of a specific transcription factor thereto and a reporter gene
connected to the downstream of the promoter region can be introduced into, for example, baker's yeast used as a host by
the usual method such as the lithium acetate method (CLONTECH, Yeast Protocols Handbook, PT3024-1: 17-20). A
desired gene is integrated into a chromosome or allowed to exist as an intranuclear plasmid, depending on the
difference in vectors used (the integration vector or the plasmid vector mentioned above). A gene can be introduced into
animal cells by usual methods such as the liposome method (Sambrook, J., Molecular Cloning: A Laboratory Manual,
1989, 2nd Ed. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY). A desired gene is integrated into a
chromosome or allowed to remain as an intranuclear episome depending on the difference in vectors used (the integration
vector or the episome vector mentioned above).

[0026] A commercial eukaryotic host organism having in its cell the promoter region and reporter gene mentioned
above can be used. For example, when LexA is used as a DNA binding domain of a transcription factor, yeast
EGY48[p8op-lacZ] (available from CLONTECH), which has the promoter region containing the LexA operator
sequence and the reporter genes LEU2 and β-galactosidase downstream thereof on the chromosome and the plasmid,
respectively, can be used.

[0027] A vector comprising a fusion DNA formed by a DNA encoding a specific transcription factor without nuclear
transportability and a test DNA can be introduced into a eukaryotic host, such as baker's yeast used as a host, by the
usual method such as the lithium acetate method (CLONTECH, Yeast Protocols Handbook, PT3024-1: 17-20). A
desired gene is integrated into a chromosome or is allowed to exist as an intranuclear plasmid depending on the
difference in vectors used (the integration vector or the plasmid vector mentioned above). A gene can be introduced into
animal cells by usual methods such as the liposome method (Sambrook, J., Molecular Cloning: A Laboratory Manual,
1989, 2nd Ed. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY). A desired gene is integrated into a
chromosome or is allowed to remain as an intranuclear episome depending on the difference in vectors used (the integration
vector or the episome vector mentioned above).

[0028] The expression of a reporter gene in the transformants thus obtained can easily be detected based on the
viability of the host, for example, baker's yeast, which has been subjected to thorough genetic analysis, when a gene
(LEU2, HIS3, TRP1, URA3, etc.) related to the auxotrophy of the host, a gene (GAL1, etc.) related to the metabolism
of an essential nutritive source, or a gene that can complement the deficiency of another gene essential for survival is
used. In addition, β-galactosidase, chloramphenicol acetyltransferase, or luciferase, each of which is a generally used
reporter, enables detecting the expression of a reporter gene based on the enzymatic activity, and green fluorescent
protein (CLONTECH) enables directly detecting the fluorescence emitted by living cells. In animal cells, the expression
of the generally used reporter gene or drug resistance gene mentioned previously can be detected similarly. A test DNA is
thought to encode a peptide with nuclear transportability if the expression of the reporter gene is detected, and not to
encode such a peptide if the expression of the reporter gene is not detected.

[0029] The second aspect of the present invention relates to a method for isolating a test DNA encoding a peptide
with nuclear transportability. This method comprises introducing a fusion DNA formed by a DNA encoding a
transcription factor without nuclear transportability and a test DNA into a eukaryotic host having in its nucleus a promoter
region activated by binding of said transcription factor thereto and a reporter gene connected to the downstream of said
promoter region, detecting the expression of said reporter gene, and isolating the test DNA from the eukaryotic host in
which the expression has been detected. The test DNA can be isolated from the eukaryotic host, for example baker's
yeast, in which the expression has been detected, by transforming E. coli with the plasmid isolated from a single colony
and by isolating the plasmid again from said transformant when the test DNA exists on a plasmid (yeast-E. coli shuttle
vector). Alternatively, the test DNA can be isolated by PCR amplification using the total DNA isolated from a single
colony as a template (CLONTECH, Yeast Protocols Handbook, PT3024-1: 29-37). The test DNA can basically be isolated
from animal cells by PCR amplification using the total DNA isolated from a single colony as a template.
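
The selection step of this method can be pictured as a filter over transformed colonies. The sketch below is a minimal illustration, assuming the two reporters named in the claims (LEU2 and a β-galactosidase gene); the `Colony` record and its field names are hypothetical, and requiring both reporters by default is only one possible choice, since the claims also allow either reporter alone.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Colony:
    """Hypothetical record of one transformed colony and its reporter read-outs."""
    clone_id: str
    grows_without_leucine: bool  # LEU2 reporter expressed
    beta_gal_positive: bool      # beta-galactosidase reporter expressed


def select_positive_clones(colonies: Iterable[Colony],
                           require_both: bool = True) -> List[str]:
    """Return IDs of colonies from which the test DNA would be isolated."""
    positives = []
    for colony in colonies:
        signals = (colony.grows_without_leucine, colony.beta_gal_positive)
        hit = all(signals) if require_both else any(signals)
        if hit:
            positives.append(colony.clone_id)
    return positives
```

For example, a colony positive for both reporters is kept under either setting, while a colony positive for only one reporter is kept only when `require_both=False`.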
[0030] A further aspect of the present invention relates to a vector comprising a DNA encoding a transcription factor
without nuclear transportability and an introduction site for the test DNA adjacent thereto, and to a kit comprising said
vector and a eukaryotic host. This eukaryotic host has in its nucleus an expression unit comprising a promoter region
to which said transcription factor binds and a reporter gene connected to the downstream of said promoter region. After
a test DNA is introduced into the introduction site for the test DNA, the vector of the present invention is introduced into
the eukaryotic host having in its nucleus an expression unit composed of a promoter region to which said transcription
factor binds and a reporter gene connected to the downstream of said promoter region. The introduction site for the test
DNA is usually a site cleaved with a unique restriction enzyme on the vector. When the vector is introduced into the
eukaryotic host, the test DNA inserted into the vector is thought to encode a peptide with nuclear transportability if the
expression of the reporter gene is detected in the eukaryotic host, and not to encode such a peptide if the expression of
the reporter gene is not detected. It is thus possible to easily detect whether the test DNA encodes a peptide with
nuclear transportability and to easily isolate the DNA encoding a peptide with nuclear transportability. In particular,
constructing a DNA library with the vector mentioned above, introducing the library into the eukaryotic host mentioned
above, and detecting the expression of the reporter gene enables efficient and complete isolation of the DNA encoding a
peptide with nuclear transportability from the library.

EP 0 995 797 A1

Brief Description of the Drawings 
[0031]

Figure 1 shows the plasmid pLexAD.
Figure 2 shows the plasmid pLexADrev.
Figure 3 shows the plasmid pRS1F.
Figure 4 shows the plasmid pRS3F.
Figure 5 shows the assay for the nuclear transportability of the transcription factor used for fusing with a test peptide.
Figure 6 shows the assay for the nuclear transportability of the transcription factor fused with the test peptide.
Figure 7 shows the plasmid pNS.
Figure 8 shows the assay for the nuclear transportability of various peptides, using the plasmid pNS.

Best Mode for Implementing the Invention 

[0032] In the following, the present invention is explained in more detail with examples, but is not to be construed
to be limited thereto. In the examples, the basic genetic engineering techniques follow the literature unless otherwise
specified (Sambrook, J., Molecular Cloning: A Laboratory Manual, 1989, 2nd Ed., Cold Spring Harbor Laboratory Press,
Cold Spring Harbor, NY). Products for genetic engineering such as restriction enzymes or other modifying enzymes
were purchased from Takara Shuzo and used under the conditions described in the product manuals. QIAprep Kit (QIAGEN)
was used to isolate plasmids from E. coli. ABI PRISM 377 (PERKIN ELMER) was used to determine nucleotide
sequences. Reagents from the same company were used to prepare samples for analysis according to the product
manuals. Yeast was manipulated using the MATCHMAKER LexA Two-Hybrid System (CLONTECH) (medium, host,
shuttle vector, method for gene introduction, assay method for the reporter gene, method for gene isolation, etc.)
following the Yeast Protocols Handbook, which accompanied the system. In addition, custom oligonucleotides were
synthesized to order by Toa Gosei.

Example 1 Construction of the nuclear transport protein trap vector

(1) PCR amplification of the DNA sequence encoding GAL4 transcription activating domain

[0033] DNA fragments comprising the DNA sequence encoding the GAL4 transcription activating domain (the
nucleotide sequence is shown in SEQ ID NO: 3) were amplified on GeneAmp PCR System 2400 (PERKIN ELMER)
using primer NU13 (SEQ ID NO: 1), which has an added EcoRI site at its 5'-terminus, and MATCHMAKER 3' AD
LD-Insert Screening Amplimer (CLONTECH) (SEQ ID NO: 2) as primers, and using plasmid pACT2 (CLONTECH) as a
template. TaKaRa Ex Taq (TaKaRa) was used as Taq polymerase under the conditions described in the product
manual. The DNA fragments thus amplified were purified by ethanol precipitation and digested with restriction enzymes
EcoRI and NcoI, then subjected to electrophoresis on a 6% polyacrylamide gel. The desired DNA
fragment was then cut out of the gel and recovered by the electroelution method.
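
The primer design described in (1) can be checked with a minimal Python sketch. It locates the EcoRI recognition sequence (GAATTC) in primer NU13 (SEQ ID NO: 1) and compares the rest of the primer with the first 21 nucleotides of the GAL4 transcription activating domain (SEQ ID NO: 3). The sequences are copied from the Sequence Listing; the function and variable names are illustrative assumptions and are not part of the original disclosure.

```python
# Primer NU13 (SEQ ID NO: 1) and the first 21 nt of SEQ ID NO: 3,
# both copied from the Sequence Listing of this document.
NU13 = "TTTGAATTCGCCAATTTTAATCAAAGTGGG"
GAL4_AD_START = "GCCAATTTTAATCAAAGTGGG"
ECORI_SITE = "GAATTC"  # EcoRI recognition sequence


def split_primer(primer: str, site: str) -> tuple[str, str, str]:
    """Split a primer into (5' extra bases, restriction site, annealing region)."""
    pos = primer.find(site)
    if pos < 0:
        raise ValueError("restriction site not found in primer")
    return primer[:pos], site, primer[pos + len(site):]


extra_bases, site, annealing = split_primer(NU13, ECORI_SITE)
print("5' extra bases:", extra_bases)
print("restriction site:", site)
print("annealing region matches SEQ ID NO: 3 start:", annealing == GAL4_AD_START)
```

Under these assumptions, the primer consists of three extra bases, the EcoRI site, and a 21-nucleotide region identical to the start of the activating-domain sequence.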

(2) Construction of the vector pLexAD expressing the fusion protein formed by the LexA protein and the GAL4
transcription activating domain

[0034] pLexAD was constructed by inserting the DNA fragment encoding the GAL4 transcription activating domain
of (1) above between the EcoRI and NcoI sites within the multiple cloning site of plasmid pLexA (CLONTECH) (Fig. 1).
Nucleotide sequence determination confirmed that the desired fragment was correctly inserted. The nucleotide
sequence of the LexA gene is shown in SEQ ID NO: 4.

(3) Construction of the vector pLexADrev, in which the nuclear export signal (NES) is inserted into the N terminus of
LexA

[0035] The nuclear export signal (SEQ ID NO: 5) of the Rev protein of HIV was synthesized as follows and inserted
into the HpaI site near the N terminus of the LexA protein encoded by pLexAD. NU9 (SEQ ID NO: 6) and NU10 (SEQ
ID NO: 7) were synthesized as the sense and antisense strands, respectively; phosphorylated at their 5' termini with T4
polynucleotide kinase; and annealed with each other. This DNA fragment was inserted into pLexAD that had been
digested with HpaI and dephosphorylated with alkaline phosphatase, to construct pLexADrev (Fig. 2). Nucleotide
sequence determination confirmed that the desired fragment was properly inserted.
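
The insertion described in (3) can be illustrated with a short Python sketch. It checks that NU10, as printed in the Sequence Listing, is the base-by-base complement of NU9 (which suggests that the antisense strand is listed in the 3' to 5' direction), and that placing NU9 at the blunt HpaI cut (GTT/AAC) in the 5' end of the LexA coding sequence (SEQ ID NO: 4) reproduces the first 48 nucleotides of SEQ ID NO: 8. All sequences are copied from the Sequence Listing; the helper names are hypothetical.

```python
NU9 = "ACAGCTGCCACCGATTGAGAGACTTACGTT"   # SEQ ID NO: 6, sense strand
NU10 = "TGTCGACGGTGGCTAACTCTCTGAATGCAA"  # SEQ ID NO: 7, as printed
LEXA_START = "ATGAAAGCGTTAACGGCC"        # first 18 nt of SEQ ID NO: 4
SEQ8_START = (                           # first 48 nt of SEQ ID NO: 8
    "ATGAAAGCGTTA" "CAGCTGCCACCG" "ATTGAGAGACTT" "ACGTTAACGGCC"
)
HPAI_SITE = "GTTAAC"   # HpaI recognition sequence
HPAI_CUT_OFFSET = 3    # blunt cut between GTT and AAC

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def complement(seq: str) -> str:
    """Base-by-base complement, without reversing the strand."""
    return seq.translate(COMPLEMENT)


def insert_at_blunt_site(target: str, site: str, cut_offset: int, insert: str) -> str:
    """Insert a blunt-ended fragment at the first occurrence of a restriction site."""
    pos = target.find(site)
    if pos < 0:
        raise ValueError("restriction site not found")
    cut = pos + cut_offset
    return target[:cut] + insert + target[cut:]


strands_are_complementary = complement(NU9) == NU10
product = insert_at_blunt_site(LEXA_START, HPAI_SITE, HPAI_CUT_OFFSET, NU9)
print("NU9/NU10 complementary:", strands_are_complementary)
print("insert keeps reading frame:", len(NU9) % 3 == 0)
print("product matches SEQ ID NO: 8 start:", product == SEQ8_START)
```

Because the insert is 30 nucleotides long, the LexA reading frame downstream of the HpaI site is preserved, which is consistent with SEQ ID NO: 8.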

(4) Construction of plasmid pRS1F, which has the CEN/ARS region as its replication origin and expresses the fusion
protein formed by the LexA protein and the GAL4 transcription activating domain, and plasmid pRS3F, which has the
CEN/ARS region as its replication origin and expresses the fusion protein formed by the LexA protein into which NES
is inserted and the GAL4 transcription activating domain

[0036] The minimal unit for expressing in yeast the fusion protein formed by the usual LexA protein without the
nuclear export signal (NES) and the GAL4 transcription activating domain is the DNA fragment of about 1.7 kb obtained
by digesting pLexAD with SphI. The minimal unit for expressing in yeast the fusion protein formed by the LexA protein
into which NES is inserted at its N terminus and the GAL4 transcription activating domain (the nucleotide sequence
described with the amino acid sequence of the fusion protein is shown in SEQ ID NO: 8) is the DNA fragment of about
1.7 kb obtained by digesting pLexADrev with SphI. Each expression unit comprises the ADH1 promoter region, the
coding region of the protein to be expressed, the multiple cloning site, and the ADH1 terminator region. After being
purified, the DNA fragments of the respective expression units were inserted into the SphI site of the vector pRSF to
construct pRS1F and pRS3F (Figures 3 and 4, respectively). The vector pRSF, which provides the SphI site, had been
constructed by replacing the PvuII fragment comprising the multiple cloning site of plasmid pRS413 (STRATAGENE)
(yeast shuttle vector, CEN/ARS origin) with the PvuII fragment comprising the multiple cloning site of the generally
used plasmid pUC19. Nucleotide sequence determination confirmed that the desired fragment was properly inserted.
Since pRS1F (positive control) and pRS3F thus constructed have the pLexA-derived multiple cloning site immediately
after the gene encoding a fusion protein functioning as a transcription factor, a DNA fragment such as the desired
cDNA can easily be fused and expressed by the usual method.

Example 2 Demonstration of effectiveness of the nuclear transport protein trap vector pRS3F by fusing the cDNA of an
artificial nuclear transport protein

(1) Fusing known cDNA fragments

[0037] The known cDNA fragment used encodes the Pseudomonas aeruginosa branched-chain amino acid binding protein
(BraC) whose secretion signal had been deleted ('BraC) and which is localized in the cytoplasm (the nucleotide
sequence described with the amino acid sequence of said protein is shown in SEQ ID NO: 9) (Tanaka, M., New Lectures
on Biochemical Experiments, Vol. 6, Ed. Japanese Biochemical Society, Biomembrane and Membrane Transport, Vol. 2,
1992, Tokyo Kagaku Dojin, 9-15). This known cDNA fragment was fused at its N terminus with the nuclear transport
signal derived from SV40 large T antigen to construct a cDNA fragment encoding an artificial nuclear transport protein.
This fragment was then fused in-frame to the C terminus of the GAL4 transcription activating domain of pRS3F.
Specifically, the DNA fragment (NcoI-DraI) encoding 'BraC was inserted into pRS3F, which had been digested with XhoI,
blunted with Klenow treatment, digested with NcoI, and purified, to construct pRS3F'BraC. Moreover, the synthetic DNA
fragment encoding the nuclear transport signal derived from SV40 large T antigen (SEQ ID NO: 10) was inserted into
pRS3F'BraC that had been digested with NheI and NcoI and purified, to construct pRS3FN'BraC. The synthetic DNA
fragment was prepared in such a way that NU17 (SEQ ID NO: 11) and NU18 (SEQ ID NO: 12) were synthesized as the sense
and antisense strands, respectively; phosphorylated at their 5' termini with T4 polynucleotide kinase; and annealed
with each other. As a control, pRS3FN, which had the nuclear transport signal but not the 'BraC fragment, was
constructed similarly. Nucleotide sequence determination confirmed that the desired fragment was properly inserted.

(2) Assay for nuclear transportability by the expression of a reporter gene

[0038] Host yeast EGY48[p8op-lacZ] (CLONTECH), which has the promoter region containing the LexA operator
sequence (SEQ ID NO: 13) (Estojak, J., Mol. Cell. Biol., 1995, 15: 5820-5829), and reporter genes LEU2 and
β-galactosidase downstream thereof on the chromosome and the plasmid, was transformed with the three plasmids
pRS3F'BraC, pRS3FN'BraC, and pRS3FN or the plasmids constructed in Example 1, pRS1F and pRS3F. Introduction
of the desired plasmid into hosts was confirmed by the complementation of HIS, an auxotrophy marker. Each of the
transformants was then replicated onto a medium (SD/-LEU, -HIS, -URA, X-gal) to assay the expression of the reporter
genes and cultivated at 30°C for two to three days. As a result, reporter genes β-galactosidase and LEU2 were both
expressed in the transformants carrying pRS3FN'BraC in which the artificial nuclear transport protein had been fused,
pRS3FN in which only the nuclear transport signal had been fused, or pRS1F, a positive control, and both blue coloring
and normal growth were confirmed (Figs. 5 and 6). In contrast, the reporter genes were hardly expressed in the
transformants carrying pRS3F'BraC in which the protein without the nuclear transport signal had been fused, and pRS3F
in which nothing had been fused, and neither blue coloring nor normal growth was observed (Figs. 5 and 6).
[0039] The above results indicate that whether or not a DNA fragment encodes a peptide with nuclear transportability
can be easily determined by fusing the DNA fragment in-frame to the C terminus of the transcription factor
encoded by pRS3F, expressing the fusion protein in yeast, and detecting the expression of the reporter gene as an
indication.

Example 3 Construction of the vector pNS for preparing a cDNA library

[0040] pRS3F was modified as follows. (1) The EcoRI site at the junction between LexA and GAL4AD was
removed. (2) A new EcoRI site was introduced into the multiple cloning site. (3) The unnecessary region derived from
pRS413 was removed to minimize the size of the vector.

[0041] First, a synthetic linker, with sense strand NU31 (SEQ ID NO: 14) and antisense strand NU30 (SEQ ID NO:
15), was inserted into the EcoRI site of pLexADrev to obtain the plasmid pLexADrev-dE. The DNA fragment of about
1.7 kb containing the ADH1 expression unit and obtained by digesting pLexADrev-dE with the restriction enzyme SphI
was subcloned into the SphI site of the generally used plasmid pUC19 to obtain the plasmid pULexADrev-dE. A synthetic
linker with an EcoRI site, with sense strand NU28 (SEQ ID NO: 16) and antisense strand NU29 (SEQ ID NO: 17),
was inserted between the NheI site and the NcoI site of pULexADrev-dE to obtain the plasmid pULexADrev-E. pRS413
was digested with DraIII and PvuII to remove a 757 bp DNA fragment comprising the multiple cloning site. A synthetic
linker with an SphI site, whose sense strand was NU25 (SEQ ID NO: 18) and whose antisense strand was NU26 (SEQ ID
NO: 19), was inserted into the digested plasmid to obtain the plasmid pRS-S. The DNA fragment of about 1.7 kb
containing the ADH1 expression unit and obtained by digesting the above-mentioned plasmid pULexADrev-E with SphI
was inserted into the SphI site of pRS-S to construct the vector pNS for preparing a cDNA library (Fig. 7). (The
transcription direction by ADH1 is the same as that of HIS3.)

Example 4 Construction of the fusion protein-expression library (derived from human cultured cells, NT2 precursor
cells) and the nuclear transport assay

(1) Construction of the fusion protein-expression library

[0042] Human cultured cells (NT2 precursor cells (Stratagene)) were cultivated following the recommended protocol
(Catalog #204101, Revision #036002a), and the mRNA was prepared using the commercial total mRNA extraction
kit and the mRNA extraction kit (Pharmacia). A cDNA library was constructed using a 3 µg portion of the mRNA
obtained above and the commercial cDNA synthesis kit (Pharmacia). Specifically, cDNA was synthesized using
oligo(dT)12-18 primer and inserted into the EcoRI/NotI sites of the pNS vector. The cDNA was unidirectionally inserted
using the Directional Cloning Toolbox (Pharmacia). Commercial E. coli (ElectroMAX DH10B Cells from GibcoBRL) was
transformed with a part of the cDNA library thus constructed by electroporation (Gene Pulser from BIO RAD) according
to the usual method (New Protocols for Cell Engineering Experiments, Shujunsha, 114-115). The transformants
obtained were cultivated on LB agar media comprising ampicillin (100 µg/ml) at 30°C for 16 hours. After harvesting,
the plasmids were prepared (QIAGEN Maxi kit from QIAGEN).

(2) Nuclear transport assay using yeast

[0043] EGY48 was transformed with 60 µg of the plasmid of the fusion protein-expression library prepared above
by the usual method (CLONTECH, Yeast Protocols Handbook, PT3024-1: 17-20). The transformants were cultivated
on SD agar media (-His/-Leu) at 30°C for three to seven days to screen the clones for expression of reporter gene
LEU2, and about 1,000 positive clones were obtained.

(3) Nucleotide sequencing

[0044] The nucleotide sequence of the cDNA fragment inserted into the vector was determined for 12 of the
positive clones so obtained. To determine the nucleotide sequence, the template DNA was first prepared from each clone
by colony PCR. A small amount of cells harvested from each clone was added to 20 µl of a PCR reaction
mixture (0.5 units of thermotolerant DNA polymerase (Ex Taq from TaKaRa), 4 nmol of dNTP mixture, 0.4 pmol each of
primer NU15 (SEQ ID NO: 20) and primer NU36 (SEQ ID NO: 21), 2 µl of the attached buffer, and sterilized water). The
inserted cDNA fragment was then amplified on GeneAmp PCR System 2400 (PERKIN ELMER) for 40 cycles of
incubation at 94°C for denaturation, incubation at 60°C for annealing, and incubation at 72°C for extension. Unreacted
primers were desalted and removed from each PCR product using Microcon-100 (Millipore) to obtain the template DNA. A
100 to 200 ng portion of the template so obtained was sequenced by the method recommended in the product manual
of ABI.
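
For orientation, the amounts given for the colony PCR mixture can be converted into final concentrations. The following Python sketch is only an arithmetic illustration; it assumes that the quoted amounts refer to the whole 20 µl reaction and that the 4 nmol of dNTP mixture is a total amount, neither of which is stated explicitly in the text. The function name is hypothetical.

```python
def concentration_uM(amount_pmol: float, volume_ul: float) -> float:
    """Return the molar concentration in µM (1 pmol per µl equals 1 µM)."""
    return amount_pmol / volume_ul


REACTION_VOLUME_UL = 20.0   # total reaction volume
DNTP_PMOL = 4.0 * 1000.0    # 4 nmol of dNTP mixture, expressed in pmol
PRIMER_PMOL = 0.4           # amount of each primer
BUFFER_UL = 2.0             # volume of the attached buffer

dntp_uM = concentration_uM(DNTP_PMOL, REACTION_VOLUME_UL)
primer_nM = concentration_uM(PRIMER_PMOL, REACTION_VOLUME_UL) * 1000.0
buffer_dilution = REACTION_VOLUME_UL / BUFFER_UL

print(f"dNTP mixture: {dntp_uM:.0f} µM")
print(f"each primer: {primer_nM:.0f} nM")
print(f"buffer diluted {buffer_dilution:.0f}-fold")
```

The ten-fold dilution of the attached buffer would be consistent with a 10x stock, although the concentration of the stock is not given in the text.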

(4) Database analysis of the obtained clones

[0045] The nucleotide sequence of each clone was searched in a public database, Basic BLAST
(http://www.ncbi.nlm.nih.gov/cgi-bin/BLAST/nph-blast?Jform=0) of the National Center for Biotechnology Information
(NCBI). As a result, all 12 clones were found to coincide with known genes. It has been so far reported or suggested
that 10 of the 12 clones function in the nuclei. Of those 10, five clones, NP220 (Inagaki, H., J. Biol. Chem., 1996, 271:
12525-12531), PC4 (Ge, H., Cell, 1994, 78: 513-523), ERC-55 (Imai, T., Biochem. Biophys. Res. Commun., 1997, 233:
765-769), histone binding protein (O'Rand, M. G., Dev. Biol., 1992, 154: 37-44), and prothymosin a1 (Manrow, R. E., J.
Biol. Chem., 1991, 266: 3916-3924), have sequences rich in basic amino acids similar to the nuclear transport signal
of the SV40 large T antigen type. Furthermore, one clone, hnRNP A1 (Michael, W. M., Cell, 1995, 83: 415-422), has the M9
sequence, which plays a role in both nuclear import and export, and four clones, ferritin H chain (Cai, C. X., J. Biol.
Chem., 1997, 272: 12831-12839), chaperonin 10 (Bonardi, M. A., Biochem. Biophys. Res. Commun., 1995, 206: 260-
265), protein kinase C inhibitor-1 (Brzoska, P. M., Proc. Natl. Acad. Sci., 1995, 92: 7824-7828), and steroid receptor
coactivator 1 (Onate, S. A., Science, 1995, 270: 1354-1357), have no known nuclear transport signal. No known nuclear
transport signal was found in the remaining two clones, tropomyosin (Lin, C.-S., Mol. Cell. Biol., 1988, 8: 160-168) and
G-rich sequence factor-1 (Qian, Z., Nucleic Acids Res., 1994, 22: 2334-2343), whose function in the nuclei had not
been reported to date.

Example 5 Construction of the fusion protein-expression library (derived from human fetal brain) and nuclear transport
assay

(1) Construction of the fusion protein-expression library

[0046] First, the commercial human fetal brain cDNA library (SUPERSCRIPT library from GibcoBRL) was amplified
according to the recommended protocol. The plasmid having the cDNA insert was then purified using the plasmid
isolation kit from QIAGEN. Next, the cDNA fragments excised from a 30 µg portion thereof with two restriction
enzymes, EcoRI and NotI, were size-selected in the range of 0.7 to 4 kb and purified by 0.8% agarose gel electrophoresis.
The cDNA fragments so obtained were inserted into the EcoRI/NotI sites of the pNS vector mentioned above to
construct the fusion protein-expression library. Commercial E. coli (ElectroMAX DH10B Cells from GibcoBRL) was
transformed with a portion of the library thus obtained by electroporation (Gene Pulser from BIO RAD) according to
the usual method (New Protocols for Cell Engineering Experiments, Shujunsha, 114-115). The resulting transformants
were cultivated on LB agar media comprising ampicillin (100 µg/ml) at 30°C for 16 hours. After harvesting, the plasmids
were prepared (QIAGEN Maxi kit from QIAGEN).

(2) Nuclear transport assay using yeast

[0047] EGY48 was transformed with 60 µg of the plasmid of the fusion protein-expression library prepared above by the
usual method (CLONTECH, Yeast Protocols Handbook, PT3024-1: 17-20). The transformants were cultivated on SD
agar media (-His/-Leu) at 30°C for three to seven days to screen the clones for expression of the reporter gene LEU2,
and about 1,000 positive clones were obtained.

(3) Nucleotide sequence determination

[0048] The nucleotide sequence of the cDNA fragment inserted into the vector was determined for 489 of the
positive clones so obtained. To determine the nucleotide sequence, the template DNA was first prepared from each clone
by colony PCR. A small amount of cells harvested from each clone was added to 20 µl of a PCR reaction
mixture (0.5 unit of thermotolerant DNA polymerase (Ex Taq from TaKaRa), 4 nmol of the dNTP mixture, 0.4 pmol each of
primer NU15 (SEQ ID NO: 22) and primer NU36 (SEQ ID NO: 23), 2 µl of the attached buffer, and sterilized water). The
cDNA insert was then amplified on GeneAmp PCR System 9600 (PERKIN ELMER) for 40 cycles of incubation at 94°C
for denaturation, incubation at 60°C for annealing, and incubation at 72°C for extension. Unreacted primers were
desalted and removed from each PCR product using a Microcon-100 (Millipore) to obtain the template DNA. A 100 to
200 ng portion of the template so obtained was subjected to nucleotide sequence determination by the method
according to the product manual of ABI.

(4) Database analysis of the obtained clones

[0049] The nucleotide sequences of the 489 clones were searched in a public database, Basic BLAST
(http://www.ncbi.nlm.nih.gov/cgi-bin/BLAST/nph-blast?Jform=0) of the National Center for Biotechnology Information
(NCBI). As a result, 250 clones coincided with the 97 genes encoding known proteins (Tables 1 and 2), and 220 clones
coincided with the 172 genes encoding novel sequences or known Expressed Sequence Tag (EST) sequences as
candidates for genes encoding novel nuclear transport proteins. In 19 clones, the sequences were derived from the
non-coding regions of known genes or the reading frames of the codons were shifted.
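
The clone counts reported above can be cross-checked with a few lines of Python. The numbers are taken directly from the text; the category labels and variable names are illustrative only.

```python
# (number of clones, number of distinct genes) for each category in [0049];
# no gene count is given for the last category.
clone_groups = {
    "known proteins": (250, 97),
    "novel or EST sequences": (220, 172),
    "non-coding or frame-shifted": (19, None),
}

total_clones = sum(clones for clones, _ in clone_groups.values())
print("total clones:", total_clones)

for label, (clones, genes) in clone_groups.items():
    share = 100.0 * clones / total_clones
    line = f"{label}: {clones} clones ({share:.1f}% of sequenced clones)"
    if genes is not None:
        line += f", {clones / genes:.2f} clones per gene on average"
    print(line)
```

The three categories add up to the 489 sequenced clones, so every sequenced clone is accounted for in the classification.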

[0050] Of the genes isolated by the method of the present invention, Table 1 shows the genes encoding proteins
reported to function in the nuclei and Table 2 shows the genes not reported to function in the nuclei.



[Tables 1 and 2 appear here in the original publication.]

[0051] The symbols in the Tables are defined as follows.

a The shortest insert among the grouped clones of each gene, shown as a representative.
b Ten amino acids from the N terminus of the protein encoded by the gene insert.
c Medline Unique Identifier of the literature reporting the function in the nucleus.
* Clone comprising the whole coding region.

[0052] S/R rich, region rich in serine/arginine; NLS, putative nuclear transport signal rich in basic amino acid
residues; ZIP, leucine zipper; bZIP, basic leucine zipper; KNS, hnRNP K nuclear transport signal; arm, armadillo repeat;
bHLHZIP, basic helix-loop-helix leucine zipper; SH3, Src homology domain 3; ITAM, immunoreceptor tyrosine-based
activation motif.

[0053] As shown in Tables 1 and 2, at least half of the 97 known proteins are reported to function in the nuclei. In
particular, many transcription factors and DNA/RNA binding proteins are included. Hence, novel genes encoding
unknown proteins that function in the nuclei have been efficiently and specifically obtained. In addition, the KNS sequence
(Matthew, W., EMBO J., 1997, 16: 3587-3598), which plays a role in both nuclear import and export, has been found in
the hnRNP K protein of the isolated clones. The fact that the M9 sequence and the KNS sequence, which play a role in
both nuclear import and export, have been found in the clones isolated by the method of the present invention
demonstrates not only that the present invention can efficiently and specifically screen nuclear transport proteins but also that
it can be expanded to screen nuclear import and export proteins (both import into the nucleus and export from the
nucleus).

Example 6 Demonstration of the effectiveness of the nuclear transport protein trap vector pNS by fusing the cDNA
encoding a known nuclear transport protein

(1) Construction of a fusion plasmid of known cDNA fragments

[0054] The cDNAs of 'BraC (Tanaka, M., New Lectures on Biochemical Experiments, Vol. 6, Ed. Japanese
Biochemical Society, Biomembrane and Membrane Transport, Vol. 2, 1992, Tokyo Kagaku Dojin, 9-15) and
Ca2+/calmodulin-dependent protein kinase CaMKK (Tokumitsu, H., J. Biol. Chem., 1995, 270(33): 19320-19324; Tokumitsu, H.,
Intracellular localization of CaMKK, unpublished data) were used as representative proteins localized in the cytoplasm.
In addition, the cDNAs of NLS of SV40; NLS-'BraC, in which NLS of SV40 and 'BraC were artificially fused; transcription
factor NF-kappa-B p65 subunit NFKBp65 (Ganchi, P. A., Mol. Biol. Cell, 1992, 3(12): 1339-1352); and transcription
factor c-Fos (Tratner, I., Oncogene, 1991, 6(11): 2049-2053) were used as representative proteins localized in the nuclei
and having the typical nuclear transport signal. The plasmid pRS1F was used for LexAD, pNS for NES-LexAD, pRS3FN
for NES-LexAD-NLS, pRS3F'BraC for NES-LexAD-'BraC, and pRS3FN'BraC for NES-LexAD-NLS-'BraC. After
NFKBp65 was amplified by PCR using pME18S(N)-p65 (Tsuboi, A., Biochem. Biophys. Res. Commun., 1994, 199(2):
1064-1072) as a template and NU32 (SEQ ID NO: 24) and NU24 (SEQ ID NO: 25) as primers, the fragment was
digested with restriction enzymes MunI and NotI, and inserted into the EcoRI/NotI sites of pNS to construct
NES-LexAD-NFKBp65. Similarly, after c-Fos was amplified by PCR using pME18S(N)-c-Fos (Tsuboi, A., Biochem. Biophys.
Res. Commun., 1994, 199(2): 1064-1072) as a template and NU34 (SEQ ID NO: 26) and NU24 as primers, the
fragment was inserted into the EcoRI/NotI sites of pNS to construct NES-LexAD-cFos. The CaMKK cDNA fragment
generated by digesting pET-CaMKK (gift from H. Tokumitsu) with the restriction enzyme NcoI was inserted into the NcoI
site of pNS to construct NES-LexAD-CaMKK.

(2) Detection of nuclear transportability by the expression of a reporter gene

[0055] Each plasmid was introduced into the EGY48 strain, and the expression of reporter gene LEU2 was
observed. Specifically, the yeast strain was transformed with the various plasmids mentioned in (1) and directly plated
onto SD media (-HIS, -LEU). Figure 8 shows the results of cultivation at 30°C for three days. A colony formed in LexAD,
which has no NES, probably because of passive diffusion into the nucleus. In contrast, colony formation was completely
suppressed in NES-LexAD, where NES had been introduced. However, a colony did form in NES-LexAD-NLS, where
NLS had been introduced. Similarly, colony formation was observed in NES-LexAD-NLS-'BraC, NES-LexAD-NFKBp65,
and NES-LexAD-cFos, which had typical NLS. In contrast, colony formation was completely suppressed in
NES-LexAD-'BraC and NES-LexAD-CaMKK, which had no nuclear transportability. These results demonstrate that the system using
the pNS vector can specifically detect a cDNA fragment with nuclear transportability.
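
The logic of this assay can be summarized in a deliberately simplified Python sketch. It assumes a Boolean model in which a fusion protein without NES reaches the nucleus by passive diffusion, while a fusion protein with NES reaches it only when the fused peptide carries a nuclear transport signal. The class and function names are hypothetical, and the observed outcomes are those reported in the text for Figure 8.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Construct:
    name: str
    has_nes: bool   # nuclear export signal inserted into LexA
    has_nls: bool   # fused peptide provides nuclear transportability


def predicts_colony(construct: Construct) -> bool:
    """Simplified model: the LEU2 reporter is activated only if the fusion enters the nucleus."""
    if not construct.has_nes:
        return True              # passive diffusion suffices without NES
    return construct.has_nls     # NES is overcome only by a nuclear transport signal


# Constructs of Example 6 paired with the colony formation reported for Figure 8.
observations = [
    (Construct("LexAD", False, False), True),
    (Construct("NES-LexAD", True, False), False),
    (Construct("NES-LexAD-NLS", True, True), True),
    (Construct("NES-LexAD-NLS-'BraC", True, True), True),
    (Construct("NES-LexAD-NFKBp65", True, True), True),
    (Construct("NES-LexAD-cFos", True, True), True),
    (Construct("NES-LexAD-'BraC", True, False), False),
    (Construct("NES-LexAD-CaMKK", True, False), False),
]

model_agrees = all(predicts_colony(c) == observed for c, observed in observations)
for construct, observed in observations:
    print(f"{construct.name}: predicted={predicts_colony(construct)}, observed={observed}")
print("model agrees with all observations:", model_agrees)
```

The sketch ignores quantitative effects such as expression level or partial localization; it only restates the qualitative outcome described in [0055].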

Industrial Applicability 

[0056] The present invention enables easy determination of whether or not a peptide encoded by a test DNA has nuclear
transportability by using the expression of a reporter gene as an indication. The present invention also enables
quick, efficient and systematic cloning of a DNA encoding a protein with nuclear transportability by detecting the
expression of a reporter gene as an indication. Furthermore, the present invention not only greatly facilitates obtaining
a DNA encoding a novel intranuclear protein with a biologically important function but also provides gene expression
information (stage, position, expression frequency, etc.) that is very useful for studying the function of intranuclear
proteins. The use of the information is expected to contribute significantly to the development of epoch-making
pharmaceuticals.

Sequence Listing

(1) Name of Applicant: Helix Research Institute

(2) Title of the Invention: Method for Detecting and Isolating Nuclear Transport
Protein

(3) Reference Number: H1-804DP1PCT

(4) Application Number:

(5) Filing Date:

(6) Country where the priority application was filed and the application number
of the application:

Japan, No. Hei 9-124795
Japan, No. Hei 9-309686

(7) Priority date: April 28, 1998 and October 24, 1998

(8) Number of Sequences: 26

SEQ ID NO: 1:
SEQUENCE LENGTH: 30
SEQUENCE TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
MOLECULE TYPE: other nucleic acid, synthetic DNA
SEQUENCE DESCRIPTION: SEQ ID NO: 1

TTTGAATTCG CCAATTTTAA TCAAAGTGGG 30

SEQ ID NO: 2:
SEQUENCE LENGTH: 32
SEQUENCE TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
MOLECULE TYPE: other nucleic acid, synthetic DNA
SEQUENCE DESCRIPTION: SEQ ID NO: 2

TAGCATCTAT GACTTTTTGG GGCGTTCAAG TG 32



SEQ ID NO: 3:
SEQUENCE LENGTH: 342
SEQUENCE TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
MOLECULE TYPE: cDNA to mRNA
FEATURE:
NAME/KEY: domain
LOCATION: 1..342
IDENTIFICATION METHOD: S
SEQUENCE DESCRIPTION: SEQ ID NO: 3

GCC AAT TTT AAT CAA AGT GGG AAT ATT GCT GAT AGC TCA TTG TCC TTC  48
Ala Asn Phe Asn Gln Ser Gly Asn Ile Ala Asp Ser Ser Leu Ser Phe
1               5                   10                  15

ACT TTC ACT AAC AGT AGC AAC GGT CCG AAC CTC ATA ACA ACT CAA ACA  96
Thr Phe Thr Asn Ser Ser Asn Gly Pro Asn Leu Ile Thr Thr Gln Thr
            20                  25                  30

AAT TCT CAA GCG CTT TCA CAA CCA ATT GCC TCC TCT AAC GTT CAT GAT  144
Asn Ser Gln Ala Leu Ser Gln Pro Ile Ala Ser Ser Asn Val His Asp
        35                  40                  45

AAC TTC ATG AAT AAT GAA ATC ACG GCT AGT AAA ATT GAT GAT GGT AAT  192
Asn Phe Met Asn Asn Glu Ile Thr Ala Ser Lys Ile Asp Asp Gly Asn
    50                  55                  60

AAT TCA AAA CCA CTG TCA CCT GGT TGG ACG GAC CAA ACT GCG TAT AAC  240
Asn Ser Lys Pro Leu Ser Pro Gly Trp Thr Asp Gln Thr Ala Tyr Asn
65                  70                  75                  80

GCG TTT GGA ATC ACT ACA GGG ATG TTT AAT ACC ACT ACA ATG GAT GAT  288
Ala Phe Gly Ile Thr Thr Gly Met Phe Asn Thr Thr Thr Met Asp Asp
                85                  90                  95

GTA TAT AAC TAT CTA TTC GAT GAT GAA GAT ACC CCA CCA AAC CCA AAA  336
Val Tyr Asn Tyr Leu Phe Asp Asp Glu Asp Thr Pro Pro Asn Pro Lys
            100                 105                 110

AAA GAG  342
Lys Glu
SEQ ID NO: 4:
SEQUENCE LENGTH: 609
SEQUENCE TYPE: nucleic acid
STRANDEDNESS: double
TOPOLOGY: linear
MOLECULE TYPE: cDNA to mRNA
FEATURE:
NAME/KEY: CDS
LOCATION: 1..606
IDENTIFICATION METHOD: S
SEQUENCE DESCRIPTION: SEQ ID NO: 4

ATG AAA GCG TTA ACG GCC AGG CAA CAA GAG GTG TTT GAT CTC ATC CGT  48
Met Lys Ala Leu Thr Ala Arg Gln Gln Glu Val Phe Asp Leu Ile Arg
1               5                   10                  15

GAT CAC ATC AGC CAG ACA GGT ATG CCG CCG ACG CGT GCG GAA ATC GCG  96
Asp His Ile Ser Gln Thr Gly Met Pro Pro Thr Arg Ala Glu Ile Ala
            20                  25                  30

CAG CGT TTG GGG TTC CGT TCC CCA AAC GCG GCT GAA GAA CAT CTG AAG  144
Gln Arg Leu Gly Phe Arg Ser Pro Asn Ala Ala Glu Glu His Leu Lys
        35                  40                  45

GCG CTG GCA CGC AAA GGC GTT ATT GAA ATT GTT TCC GGC GCA TCA CGC  192
Ala Leu Ala Arg Lys Gly Val Ile Glu Ile Val Ser Gly Ala Ser Arg
    50                  55                  60

GGG ATT CGT CTG TTG CAG GAA GAG GAA GAA GGG TTG CCG CTG GTA GGT  240
Gly Ile Arg Leu Leu Gln Glu Glu Glu Glu Gly Leu Pro Leu Val Gly
65                  70                  75                  80

CGT GTG GCT GCC GGT GAA CCA CTT CTG GCG CAA CAG CAT ATT GAA GGT  288
Arg Val Ala Ala Gly Glu Pro Leu Leu Ala Gln Gln His Ile Glu Gly
                85                  90                  95

CAT TAT CAG GTC GAT CCT TCC TTA TTC AAG CCG AAT GCT GAT TTC CTG  336
His Tyr Gln Val Asp Pro Ser Leu Phe Lys Pro Asn Ala Asp Phe Leu
            100                 105                 110

CTG CGC GTC AGC GGG ATG TCG ATG AAA GAT ATC GGC ATT ATG GAT GGT  384
Leu Arg Val Ser Gly Met Ser Met Lys Asp Ile Gly Ile Met Asp Gly
        115                 120                 125

GAC TTG CTG GCA GTG CAT AAA ACT CAG GAT GTA CGT AAC GGT CAG GTC  432
Asp Leu Leu Ala Val His Lys Thr Gln Asp Val Arg Asn Gly Gln Val
    130                 135                 140

GTT GTC GCA CGT ATT GAT GAC GAA GTT ACC GTT AAG CGC CTG AAA AAA  480
Val Val Ala Arg Ile Asp Asp Glu Val Thr Val Lys Arg Leu Lys Lys
145                 150                 155                 160

CAG GGC AAT AAA GTC GAA CTG TTG CCA GAA AAT AGC GAG TTT AAA CCA  528
Gln Gly Asn Lys Val Glu Leu Leu Pro Glu Asn Ser Glu Phe Lys Pro
                165                 170                 175

ATT GTC GTT GAC CTT CGT CAG CAG AGC TTC ACC ATT GAA GGG CTG GCG  576
Ile Val Val Asp Leu Arg Gln Gln Ser Phe Thr Ile Glu Gly Leu Ala
            180                 185                 190

GTT GGG GTT ATT CGC AAC GGC GAC TGG CTG TAA  609
Val Gly Val Ile Arg Asn Gly Asp Trp Leu
        195                 200

SEQ ID NO: 5:
SEQUENCE LENGTH: 10
SEQUENCE TYPE: amino acid
TOPOLOGY: linear
MOLECULE TYPE: peptide
SEQUENCE DESCRIPTION: SEQ ID NO: 5

Gln Leu Pro Pro Leu Glu Arg Leu Thr Leu
1               5                   10

SEQ ID NO: 6:
SEQUENCE LENGTH: 30
SEQUENCE TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
MOLECULE TYPE: other nucleic acid, synthetic DNA
SEQUENCE DESCRIPTION: SEQ ID NO: 6

ACAGCTGCCA CCGATTGAGA GACTTACGTT 30

SEQ ID NO: 7:
SEQUENCE LENGTH: 30
SEQUENCE TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
MOLECULE TYPE: other nucleic acid, synthetic DNA
SEQUENCE DESCRIPTION: SEQ ID NO: 7

TGTCGACGGT GGCTAACTCT CTGAATGCAA 30

SEQ ID NO: 8: 
SEQUENCE LENGTH: 1080 
SEQUENCE TYPE: nucleic acid 
STRANDEDNESS: double
TOPOLOGY: linear
MOLECULE TYPE: cDNA to mRNA
FEATURE:

NAME/KEY: CDS 
LOCATION: 1..1077 
IDENTIFICATION METHOD: E 
SEQUENCE DESCRIPTION: SEQ ID NO: 8

ATG AAA GCG TTA CAG CTG CCA CCG ATT GAG AGA CTT ACG TTA ACG GCC 48 
Met Lys Ala Leu Gln Leu Pro Pro Ile Glu Arg Leu Thr Leu Thr Ala
1 5 10 15 

AGG CAA CAA GAG GTG TTT GAT CTC ATC CGT GAT CAC ATC AGC CAG ACA 96 
Arg Gln Gln Glu Val Phe Asp Leu Ile Arg Asp His Ile Ser Gln Thr
20 25 30 

GGT ATG CCG CCG ACG CGT GCG GAA ATC GCG CAG CGT TTG GGG TTC CGT 144

Gly Met Pro Pro Thr Arg Ala Glu Ile Ala Gln Arg Leu Gly Phe Arg

35 40 45 

TCC CCA AAC GCG GCT GAA GAA CAT CTG AAG GCG CTG GCA CGC AAA GGC 192 
Ser Pro Asn Ala Ala Glu Glu His Leu Lys Ala Leu Ala Arg Lys Gly

50 55 60 

GTT ATT GAA ATT GTT TCC GGC GCA TCA CGC GGG ATT CGT CTG TTG CAG 240 
Val Ile Glu Ile Val Ser Gly Ala Ser Arg Gly Ile Arg Leu Leu Gln
65 70 75 80 

GAA GAG GAA GAA GGG TTG CCG CTG GTA GGT CGT GTG GCT GCC GGT GAA 288 
Glu Glu Glu Glu Gly Leu Pro Leu Val Gly Arg Val Ala Ala Gly Glu 

85 90 95 

CCA CTT CTG GCG CAA CAG CAT ATT GAA GGT CAT TAT CAG GTC GAT CCT 336 
Pro Leu Leu Ala Gln Gln His Ile Glu Gly His Tyr Gln Val Asp Pro

100 105 110 

TCC TTA TTC AAG CCG AAT GCT GAT TTC CTG CTG CGC GTC AGC GGG ATG 384 
Ser Leu Phe Lys Pro Asn Ala Asp Phe Leu Leu Arg Val Ser Gly Met
115 120 125 

TCG ATG AAA GAT ATC GGC ATT ATG GAT GGT GAC TTG CTG GCA GTG CAT 432 
Ser Met Lys Asp Ile Gly Ile Met Asp Gly Asp Leu Leu Ala Val His

130 135 140 

AAA ACT CAG GAT GTA CGT AAC GGT CAG GTC GTT GTC GCA CGT ATT GAT 480 
Lys Thr Gln Asp Val Arg Asn Gly Gln Val Val Val Ala Arg Ile Asp
145 150 155 160 

GAC GAA GTT ACC GTT AAG CGC CTG AAA AAA CAG GGC AAT AAA GTC GAA 528 
Asp Glu Val Thr Val Lys Arg Leu Lys Lys Gln Gly Asn Lys Val Glu

165 170 175 

CTG TTG CCA GAA AAT AGC GAG TTT AAA CCA ATT GTC GTT GAC CTT CGT 576
Leu Leu Pro Glu Asn Ser Glu Phe Lys Pro Ile Val Val Asp Leu Arg

180 185 190 

CAG CAG AGC TTC ACC ATT GAA GGG CTG GCG GTT GGG GTT ATT CGC AAC 624 
Gln Gln Ser Phe Thr Ile Glu Gly Leu Ala Val Gly Val Ile Arg Asn

195 200 205 

GGC GAC TGG CTG GAA TTC GCC AAT TTT AAT CAA AGT GGG AAT ATT GCT 672 
Gly Asp Trp Leu Glu Phe Ala Asn Phe Asn Gln Ser Gly Asn Ile Ala

210 215 220 

GAT AGC TCA TTG TCC TTC ACT TTC ACT AAC AGT AGC AAC GGT CCG AAC 720
Asp Ser Ser Leu Ser Phe Thr Phe Thr Asn Ser Ser Asn Gly Pro Asn 
225 230 235 240 

CTC ATA ACA ACT CAA ACA AAT TCT CAA GCG CTT TCA CAA CCA ATT GCC 768 
Leu Ile Thr Thr Gln Thr Asn Ser Gln Ala Leu Ser Gln Pro Ile Ala

245 250 255 

TCC TCT AAC GTT CAT GAT AAC TTC ATG AAT AAT GAA ATC ACG GCT AGT 816 
Ser Ser Asn Val His Asp Asn Phe Met Asn Asn Glu Ile Thr Ala Ser

260 265 270 

AAA ATT GAT GAT GGT AAT AAT TCA AAA CCA CTG TCA CCT GGT TGG ACG 864 
Lys Ile Asp Asp Gly Asn Asn Ser Lys Pro Leu Ser Pro Gly Trp Thr

275 280 285 

GAC CAA ACT GCG TAT AAC GCG TTT GGA ATC ACT ACA GGG ATG TTT AAT 912 
Asp Gln Thr Ala Tyr Asn Ala Phe Gly Ile Thr Thr Gly Met Phe Asn

290 295 300 

ACC ACT ACA ATG GAT GAT GTA TAT AAC TAT CTA TTC GAT GAT GAA GAT 960 
Thr Thr Thr Met Asp Asp Val Tyr Asn Tyr Leu Phe Asp Asp Glu Asp 
305 310 315 320

ACC CCA CCA AAC CCA AAA AAA GAG ATC TCT ATG GCT TAC CCA TAC GAT 1008
Thr Pro Pro Asn Pro Lys Lys Glu Ile Ser Met Ala Tyr Pro Tyr Asp

325 330 335 

GTT CCA GAT TAC GCT AGC TTG GGT GGT CAT ATG GCC ATG GCG GCC GCT 1056 
Val Pro Asp Tyr Ala Ser Leu Gly Gly His Met Ala Met Ala Ala Ala 

340 345 350 

CGA GTC GAC CTG CAG CCA AGC TAA 1080 
Arg Val Asp Leu Gln Pro Ser
355 
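
The header fields of SEQ ID NO: 8 can be cross-checked with simple arithmetic: a CDS located at 1..1077 spans 1077 nucleotides, which is 359 codons, and the last 3 nucleotides of the 1080-nucleotide sequence are the TAA stop codon. The following Python sketch illustrates that check. The function name `cds_summary` and its return format are hypothetical and are not part of the sequence listing.

```python
def cds_summary(location: str, sequence_length: int) -> tuple:
    """Return (codon_count, trailing_nucleotides) for a 'start..end' CDS location."""
    start_text, end_text = location.split("..")
    start, end = int(start_text), int(end_text)
    coding_nt = end - start + 1  # positions are 1-based and inclusive
    if coding_nt % 3 != 0:
        raise ValueError("CDS length is not a whole number of codons")
    # Nucleotides after the CDS end; three of them fit a single stop codon.
    return coding_nt // 3, sequence_length - end


# Values taken from the SEQ ID NO: 8 header (LOCATION 1..1077, LENGTH 1080).
print(cds_summary("1..1077", 1080))  # (359, 3)
```

The 359 codons agree with the residue numbering of the listing, whose last row begins at residue 353 and ends with Ser.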

SEQ ID NO: 9: 

SEQUENCE LENGTH: 1152 

SEQUENCE TYPE: nucleic acid 

STRANDEDNESS: double

TOPOLOGY: linear 

MOLECULE TYPE: cDNA to mRNA 

FEATURE: 

NAME/KEY: CDS 

LOCATION: 1..1149 

IDENTIFICATION METHOD: S 
SEQUENCE DESCRIPTION: SEQ ID NO: 9 

ATG GCT AAG ATC TCT CCC GGG CTC GAG CTC ATG AAG AAG GGT ACT CAG 48 

Met Ala Lys Ile Ser Pro Gly Leu Glu Leu Met Lys Lys Gly Thr Gln

1 5 10 15 

CGT CTA TCC CGC CTG TTC GCC GCG ATG GCC ATT GCC GGG TTC GCC AGC 96

Arg Leu Ser Arg Leu Phe Ala Ala Met Ala Ile Ala Gly Phe Ala Ser

20 25 30 

TAC TCC ATG GCC GCC GAC ACC ATC AAG ATC GCC CTG GCT GGC CCG GTC 144 
Tyr Ser Met Ala Ala Asp Thr Ile Lys Ile Ala Leu Ala Gly Pro Val

35 40 45 

ACC GGT CCG GTA GCC CAG TAC GGC GAC ATG CAG CGC GCC GGT GCG CTG 192 
Thr Gly Pro Val Ala Gln Tyr Gly Asp Met Gln Arg Ala Gly Ala Leu

50 55 60 

ATG GCA ATC GAA CAG ATC AAC AAG GCA GGC GGC GTG AAC GGC GCG CAA 240 
Met Ala Ile Glu Gln Ile Asn Lys Ala Gly Gly Val Asn Gly Ala Gln
65 70 75 80 

CTC GAA GGC GTG ATC TAC GAC GAC GCC TGC GAT CCC AAG CAG GCC GTG 288
Leu Glu Gly Val Ile Tyr Asp Asp Ala Cys Asp Pro Lys Gln Ala Val

85 90 95 

GCG GTC GCC AAC AAG GTG GTC AAC GAC GGC GTC AAG TTC GTG GTC GG1 336
Ala Val Ala Asn Lys Val Val Asn Asp Gly Val Lys Phe Val Val Gly

100 105 110 

CAT GTC TGC TCC AGC TCC ACC CAA CCC GCC ACC GAC ATC TAC GAA GAC 384 
His Val Cys Ser Ser Ser Thr Gln Pro Ala Thr Asp Ile Tyr Glu Asp

115 120 125 

GAA GGC GTG CTG ATG ATC ACC CCG TCG GCC ACC GCC CCG GAA ATC ACC 432 
Glu Gly Val Leu Met Ile Thr Pro Ser Ala Thr Ala Pro Glu Ile Thr
130 135 140 

TCG CGC GGC TAC AAG CTG ATC TTC CGC ACC ATC GGC CTG GAC AAC ATG 480
Ser Arg Gly Tyr Lys Leu Ile Phe Arg Thr Ile Gly Leu Asp Asn Met
145 150 155 160 

CAG GGC CCG GTG GCC GGC AAG TTC ATC GCC GAA CGC TAC AAG GAC AAG 528 
Gln Gly Pro Val Ala Gly Lys Phe Ile Ala Glu Arg Tyr Lys Asp Lys

165 170 175 

ACC ATC GCG GTA CTG CAC GAC AAG CAG CAG TAC GGC GAA GGC ATC GCC 576 
Thr Ile Ala Val Leu His Asp Lys Gln Gln Tyr Gly Glu Gly Ile Ala
180 185 190 

ACC GAG GTG AAG AAG ACC GTG GAA GAC GCC GGC ATC AAG GTT GCC GTC 624 
Thr Glu Val Lys Lys Thr Val Glu Asp Ala Gly Ile Lys Val Ala Val

195 200 205 

TTC GAA GGC CTG AAC GCC GGC GAC AAG GAC TTC AAC GCG CTG ATC AGC 672 
Phe Glu Gly Leu Asn Ala Gly Asp Lys Asp Phe Asn Ala Leu Ile Ser

210 215 220 

AAG CTG AAG AAA GCC GGC GTG CAG TTC GTC TAC TTC GGC GGC TAC CAC 720 
Lys Leu Lys Lys Ala Gly Val Gln Phe Val Tyr Phe Gly Gly Tyr His
225 230 235 240 

CCA GAA ATG GGC CTG CTG CTG CGC CAG GCC AAG CAG GCC GGG CTG GAC 768 
Pro Glu Met Gly Leu Leu Leu Arg Gln Ala Lys Gln Ala Gly Leu Asp

245 250 255 

GCG CGC TTC ATG GGC CCG GAA GGG GTC GGC AAC AGC GAA ATC ACC GCG 816 
Ala Arg Phe Met Gly Pro Glu Gly Val Gly Asn Ser Glu Ile Thr Ala

260 265 270 

ATC GCC GGC GAC GCT TCG GAA GGC ATG CTG GCG ACC CTG CCG CGC GCC 864 
Ile Ala Gly Asp Ala Ser Glu Gly Met Leu Ala Thr Leu Pro Arg Ala

275 280 285

TTC GAG CAG GAT CCG AAG AAC AAG GCC CTG ATC GAC GCC TTC AAG GCG 912 
Phe Glu Gln Asp Pro Lys Asn Lys Ala Leu Ile Asp Ala Phe Lys Ala

290 295 300 

AAG AAC CAG GAT CCG AGC GGC ATC TTC GTC CTG CCC GCC TAC TCC GCG 960 
Lys Asn Gln Asp Pro Ser Gly Ile Phe Val Leu Pro Ala Tyr Ser Ala

305 310 315 320 

GTC ACA GTG ATC GCC AAG GGC ATC GAG AAA GCC GGC GAG GCC GAT CCG 1008 
Val Thr Val Ile Ala Lys Gly Ile Glu Lys Ala Gly Glu Ala Asp Pro

325 330 335 

GAG AAG GTC GCC GAG GCC CTG CGC GCC AAC ACC TTC GAG ACT CCC ACC 1056 
Glu Lys Val Ala Glu Ala Leu Arg Ala Asn Thr Phe Glu Thr Pro Thr 

340 345 350 

GGG AAC CTC GGG TTC GAC GAG AAG GGC GAC CTG AAG AAC TTC GAC TTC 1104 
Gly Asn Leu Gly Phe Asp Glu Lys Gly Asp Leu Lys Asn Phe Asp Phe 
355 360 365 

ACC GTC TAC GAG TGG CAC AAG GAC GCC ACC CGG ACC GAG GTC AAG 1149

Thr Val Tyr Glu Trp His Lys Asp Ala Thr Arg Thr Glu Val Lys 

370 375 380 

TAA 1152 

SEQ ID NO: 10: 
SEQUENCE LENGTH: 12 
SEQUENCE TYPE: amino acid
TOPOLOGY: linear
MOLECULE TYPE: peptide 
SEQUENCE DESCRIPTION: SEQ ID NO: 10 
Ser Glu Pro Pro Lys Lys Lys Arg Lys Val Glu Thr

1 5 10 



SEQ ID NO: 11: 
SEQUENCE LENGTH: 37 
SEQUENCE TYPE: nucleic acid 
STRANDEDNESS: single
TOPOLOGY: linear 

MOLECULE TYPE: other nucleic acid, synthetic DNA 
SEQUENCE DESCRIPTION: SEQ ID NO: 11

CTAGCGAGCC TCCAAAAAAG AAGAGAAAGG TCGAAAC 37

SEQ ID NO: 12: 
SEQUENCE LENGTH: 37 
SEQUENCE TYPE: nucleic acid 
STRANDEDNESS: single
TOPOLOGY: linear 

MOLECULE TYPE: other nucleic acid, synthetic DNA 
SEQUENCE DESCRIPTION: SEQ ID NO: 12 
GCTCGGAGGT TTTTTCTTCT CTTTCCAGCT TTGGTAC 37
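
The relationship between SEQ ID NOs: 10 to 12 can be checked directly from the listed strings. Read from its third nucleotide, SEQ ID NO: 11 encodes the first eleven residues of the peptide of SEQ ID NO: 10 (its last two nucleotides, AC, begin the Thr codon), and SEQ ID NO: 12 as printed is the base-by-base complement of SEQ ID NO: 11 from its fifth nucleotide onward, followed by four further nucleotides. The Python sketch below illustrates this. The reading-frame offset and the helper names are assumptions inferred from the sequences themselves, not statements from the listing.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str, offset: int = 0) -> str:
    """Translate complete codons starting at `offset`; a trailing partial codon is ignored."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(offset, len(dna) - 2, 3))


SEQ_ID_10 = "SEPPKKKRKVET"  # one-letter form of the listed peptide
SEQ_ID_11 = "CTAGCGAGCCTCCAAAAAAGAAGAGAAAGGTCGAAAC"
SEQ_ID_12 = "GCTCGGAGGTTTTTTCTTCTCTTTCCAGCTTTGGTAC"

# Frame offset 2 is an inferred assumption: it is the frame that matches SEQ ID NO: 10.
print(translate(SEQ_ID_11, offset=2))  # SEPPKKKRKVE

# Base-by-base complement (no reversal) of SEQ ID NO: 11 from its fifth nucleotide.
COMPLEMENT = str.maketrans("ACGT", "TGCA")
print(SEQ_ID_11[4:].translate(COMPLEMENT) == SEQ_ID_12[:33])  # True
```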

SEQ ID NO: 13: 
SEQUENCE LENGTH: 419 
SEQUENCE TYPE: nucleic acid 
STRANDEDNESS: double
TOPOLOGY: linear
MOLECULE TYPE: genomic DNA

SEQUENCE DESCRIPTION: SEQ ID NO: 13 

TCGACTGCTG TATATAAAAC CAGTGGTTAT ATGTACAGTA CTGCTGTATA TAAAACCAGT 60 
GGTTATATGT ACAGTACGTC GAGGGAATCA AATTAACAAC CATAGGATGA TAATGCGATT 120 
AGTTTTTTAG CCTTATTTCT GGGGTAATTA ATCAGCGAAG CGATGATTTT TGATCTATTA 180

ACAGATATAT AAATGCAAAA ACTGCATAAC CACTTTAACT AATACTTTCA ACATTTTCGG 240 
TTTGTATTAC TTCTTATTCA AATGTAATAA AAGTATCAAC AAAAAATTGT TAATATACCT 300 
CTATACTTTA ACGTCAAGGA GAAAAAACTA TAATGACTAA ATCTCATTCA GAAGAAGTGA 360 
TTGTACCTGA GTTCAATTCT AGCGCAAAGG AATTACCAAG ACCATTGGCC GAAAAGTGC 419 



SEQ ID NO: 14: 
SEQUENCE LENGTH: 12

SEQUENCE TYPE: nucleic acid 
STRANDEDNESS: single
TOPOLOGY: linear 

MOLECULE TYPE: other nucleic acid, synthetic DNA 
SEQUENCE DESCRIPTION: SEQ ID NO: 14 

AATTGACCAC CC 12 



SEQ ID NO: 15: 
SEQUENCE LENGTH: 12
SEQUENCE TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear 

MOLECULE TYPE: other nucleic acid, synthetic DNA 
SEQUENCE DESCRIPTION: SEQ ID NO: 15 
CTGGTGGGTT AA 

SEQ ID NO: 16: 
SEQUENCE LENGTH: 25 
SEQUENCE TYPE: nucleic acid 
STRANDEDNESS: single
TOPOLOGY: linear 

MOLECULE TYPE: other nucleic acid, synthetic DNA 
SEQUENCE DESCRIPTION: SEQ ID NO: 16 
CTAGCTTGGG TGGAATTCAT ATGGC

SEQ ID NO: 17: 
SEQUENCE LENGTH: 24 
SEQUENCE TYPE: nucleic acid 
STRANDEDNESS: single
TOPOLOGY: linear 

MOLECULE TYPE: other nucleic acid, synthetic DNA 
SEQUENCE DESCRIPTION: SEQ ID NO: 17 
GAACCCACCT TAAGTATACG GTAC 

SEQ ID NO: 18: 
SEQUENCE LENGTH: 11 
SEQUENCE TYPE: nucleic acid 
STRANDEDNESS: single
TOPOLOGY: linear 

MOLECULE TYPE: other nucleic acid, synthetic DNA 
SEQUENCE DESCRIPTION: SEQ ID NO: 18 
CTGCATGCAC C 

SEQ ID NO: 19: 
SEQUENCE LENGTH: 14 
SEQUENCE TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear 

MOLECULE TYPE: other nucleic acid, synthetic DNA 
SEQUENCE DESCRIPTION: SEQ ID NO: 19 
ATGGACGTAC GTGG

SEQ ID NO: 20: 
SEQUENCE LENGTH: 32 
SEQUENCE TYPE: nucleic acid 
STRANDEDNESS: single
TOPOLOGY: linear 

MOLECULE TYPE: other nucleic acid, synthetic DNA 
SEQUENCE DESCRIPTION: SEQ ID NO: 20 
CTATTCGATG ATGAAGATAC CCCACCAAAC CC 

SEQ ID NO: 21: 
SEQUENCE LENGTH: 30 
SEQUENCE TYPE: nucleic acid 
STRANDEDNESS: single
TOPOLOGY: linear 

MOLECULE TYPE: other nucleic acid, synthetic DNA 
SEQUENCE DESCRIPTION: SEQ ID NO: 21
GAAATTCGCC CGGAATTAGC TTGGCTGCAG 

SEQ ID NO: 22:
SEQUENCE LENGTH: 32 
SEQUENCE TYPE: nucleic acid 
STRANDEDNESS: single
TOPOLOGY: linear 

MOLECULE TYPE: other nucleic acid, synthetic DNA 
SEQUENCE DESCRIPTION: SEQ ID NO: 22 
CTATTCGATG ATGAAGATAC CCCACCAAAC CC 

SEQ ID NO: 23: 
SEQUENCE LENGTH: 30 
SEQUENCE TYPE: nucleic acid 
STRANDEDNESS: single



TOPOLOGY: linear 

MOLECULE TYPE: other nucleic acid, synthetic DNA 
SEQUENCE DESCRIPTION: SEQ ID NO: 23 

GAAATTCGCC CGGAATTAGC TTGGCTGCAG 30 

SEQ ID NO: 24: 
SEQUENCE LENGTH: 32
SEQUENCE TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear 

MOLECULE TYPE: other nucleic acid, synthetic DNA 
SEQUENCE DESCRIPTION: SEQ ID NO: 24 

TTTCAATTGG AATGGACGAA CTGTTCCCCC TC 32 

SEQ ID NO: 25: 
SEQUENCE LENGTH: 35 
SEQUENCE TYPE: nucleic acid 
STRANDEDNESS: single
TOPOLOGY: linear 

MOLECULE TYPE: other nucleic acid, synthetic DNA 
SEQUENCE DESCRIPTION: SEQ ID NO: 25 

GCGCAGCGAG TCAGTGAGCG AGGAAGCGGA AGAGG 35 

SEQ ID NO: 26: 
SEQUENCE LENGTH: 35 
SEQUENCE TYPE: nucleic acid 
STRANDEDNESS: single
TOPOLOGY: linear 

MOLECULE TYPE: other nucleic acid, synthetic DNA 
SEQUENCE DESCRIPTION: SEQ ID NO: 26 

TTTGAATTCT AATGATGTTC TCGGGTTTCA ACGCG 35 



Claims 

1. A method for detecting nuclear transportability of a peptide encoded by a test DNA, the method comprising
introducing a fusion DNA formed by a DNA encoding a transcription factor without nuclear transportability and the test
DNA into a eukaryotic host having in its nucleus a promoter region that is activated when said transcription factor
binds thereto and a reporter gene connected downstream of said promoter region, and detecting expression
of said reporter gene.

2. The method of claim 1, wherein the transcription factor without nuclear transportability is a fusion protein
comprising a nuclear export signal, a DNA binding domain, and a transcription activating domain.

3. The method of claim 1, wherein the transcription factor without nuclear transportability is a fusion protein
comprising a nuclear export signal, a LexA protein, and a GAL4-transcription activating domain, and the promoter region
activated when said transcription factor binds thereto is that of a GAL1 gene whose operator sequence is replaced
with that of LexA.

4. The method of claim 3, wherein the nuclear export signal is a peptide comprising the amino acid sequence set forth
in SEQ ID NO: 5.

5. The method of any one of claims 1 to 4, wherein the reporter gene is LEU2 and/or a β-galactosidase gene.

6. A method for isolating a DNA encoding a peptide with nuclear transportability, the method comprising introducing
a fusion DNA formed by a DNA encoding a transcription factor without nuclear transportability and a test DNA into
a eukaryotic host having in its nucleus a promoter region activated when said transcription factor binds thereto and
a reporter gene connected downstream of said promoter region, detecting the expression of said reporter gene,
and isolating the test DNA from the eukaryotic host in which the expression has been detected.

7. The method of claim 6, wherein the transcription factor without nuclear transportability is a fusion protein
comprising a nuclear export signal, a DNA binding domain, and a transcription activating domain.

8. The method of claim 6, wherein the transcription factor without nuclear transportability is a fusion protein
comprising a nuclear export signal, a LexA protein, and a GAL4-transcription activating domain, and the promoter region
activated when said transcription factor binds thereto is that of a GAL1 gene whose operator sequence is replaced
with that of LexA.

9. The method of claim 8, wherein the nuclear export signal is a peptide comprising the amino acid sequence set forth
in SEQ ID NO: 5.

10. The method of any one of claims 6 to 9, wherein the reporter gene is LEU2 and/or a β-galactosidase gene.

11. A vector comprising a DNA encoding a transcription factor without nuclear transportability and an introduction site
for a test DNA adjacent thereto.

12. The vector of claim 11, wherein the transcription factor without nuclear transportability is a fusion protein
comprising a nuclear export signal, a DNA binding domain, and a transcription activating domain.

13. The vector of claim 11, wherein the transcription factor without nuclear transportability is a fusion protein
comprising a nuclear export signal, a LexA protein, and a GAL4-transcription activating domain.

14. The vector of claim 13, wherein the nuclear export signal is the peptide comprising the amino acid sequence set
forth in SEQ ID NO: 5.

15. A kit comprising (i) a vector comprising a DNA encoding a transcription factor without nuclear transportability and
an introduction site for a test DNA adjacent thereto, and (ii) a eukaryotic host having in its nucleus an expression
unit comprising a promoter region activated when said transcription factor binds thereto and a reporter gene
connected downstream of said promoter region.

16. The kit of claim 15, wherein the transcription factor without nuclear transportability is a fusion protein comprising a
nuclear export signal, a DNA binding domain, and a transcription activating domain.

17. The kit of claim 15, wherein the transcription factor without nuclear transportability is a fusion protein
comprising a nuclear export signal, a LexA protein, and a GAL4-transcription activating domain, and the promoter region
activated when said transcription factor binds thereto is that of a GAL1 gene whose operator sequence is replaced
with that of LexA, and the eukaryotic host is yeast.

18. The kit of claim 17, wherein the nuclear export signal is a peptide comprising the amino acid sequence set forth in
SEQ ID NO: 5.

19. The kit of any one of claims 15 to 18, wherein the reporter gene is LEU2 and/or a β-galactosidase gene.
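
The selection logic recited in claims 1 and 6 can be illustrated with a deliberately simplified Python sketch. It assumes an all-or-nothing rule: the transcription factor lacking nuclear transportability reaches the promoter in the nucleus, and so induces the reporter gene, only when the fused test peptide confers nuclear transportability. The function names and the example clone names are hypothetical, and real reporter readouts would be quantitative rather than Boolean.

```python
from typing import Dict, List


def reporter_expressed(test_peptide_has_nuclear_transport: bool) -> bool:
    """Toy rule (assumption): the reporter is induced only if the fused test
    peptide carries the transcription factor into the nucleus."""
    return test_peptide_has_nuclear_transport


def isolate_positive_clones(library: Dict[str, bool]) -> List[str]:
    """Return the names of clones whose reporter is expressed, mirroring the
    detect-then-isolate order of the isolation method."""
    return [name for name, has_transport in library.items()
            if reporter_expressed(has_transport)]


# Hypothetical library: clone name -> whether its test peptide has nuclear transportability.
library = {"clone_A": True, "clone_B": False, "clone_C": True}
print(isolate_positive_clones(library))  # ['clone_A', 'clone_C']
```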

EP 0 995 797 A1 



Fig. 1
Fig. 3
Fig. 7